Poisson-Dirichlet distribution

Short description: Definition and first properties of the Poisson-Dirichlet distributions

In probability theory, a branch of mathematics, Poisson-Dirichlet distributions are probability distributions on the set of nonnegative, non-increasing sequences with sum 1, depending on two parameters [math]\displaystyle{ \alpha \in [0,1) }[/math] and [math]\displaystyle{ \theta \in (-\alpha, \infty) }[/math]. They can be defined as follows. One considers independent random variables [math]\displaystyle{ (Y_n)_{n \geq 1} }[/math] such that [math]\displaystyle{ Y_n }[/math] follows the beta distribution with parameters [math]\displaystyle{ 1-\alpha }[/math] and [math]\displaystyle{ \theta+n \alpha }[/math]. Then, the Poisson-Dirichlet distribution [math]\displaystyle{ PD(\alpha, \theta) }[/math] with parameters [math]\displaystyle{ \alpha }[/math] and [math]\displaystyle{ \theta }[/math] is the law of the random sequence obtained by arranging in decreasing order [math]\displaystyle{ Y_1 }[/math] and the products [math]\displaystyle{ Y_n \prod_{k=1}^{n-1}(1-Y_k) }[/math] for [math]\displaystyle{ n \geq 2 }[/math]. This definition is due to Jim Pitman and Marc Yor.[1][2] It generalizes Kingman's law, which corresponds to the particular case [math]\displaystyle{ \alpha = 0 }[/math].[3]
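The construction above can be sketched in Python as a truncated stick-breaking procedure. This is only an illustrative approximation, not part of the cited definition: the function name `sample_pd_prefix` and the truncation level `n_sticks` are assumptions made here, and cutting the infinite sequence after finitely many terms leaves a small amount of unassigned mass, which the function returns alongside the sorted weights.

```python
import random


def sample_pd_prefix(alpha, theta, n_sticks=1000, rng=random):
    """Approximate draw from PD(alpha, theta) by truncated stick-breaking.

    Returns (weights, remaining): the first n_sticks products sorted in
    decreasing order, and the mass left over because of the truncation.
    """
    if not 0 <= alpha < 1 or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    weights = []
    remaining = 1.0  # product of (1 - Y_k) over the sticks broken so far
    for n in range(1, n_sticks + 1):
        # Y_n follows the beta distribution with parameters 1 - alpha, theta + n * alpha.
        y = rng.betavariate(1 - alpha, theta + n * alpha)
        weights.append(y * remaining)  # Y_n * prod_{k < n} (1 - Y_k)
        remaining *= 1 - y
    # Ranking the products in decreasing order gives the (truncated) sample.
    weights.sort(reverse=True)
    return weights, remaining


if __name__ == "__main__":
    random.seed(0)
    w, rest = sample_pd_prefix(alpha=0.0, theta=1.0)
    print("largest three weights:", w[:3])
    print("mass lost to truncation:", rest)
```

Sorting only the truncated list is a simplification: a later, discarded product could in principle exceed one that was kept, so the ranking is exact only in the limit of infinitely many sticks.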

Number theory

Patrick Billingsley[4] proved the following result: if [math]\displaystyle{ n }[/math] is a uniform random integer in [math]\displaystyle{ \{2,3,\dots,N\} }[/math], if [math]\displaystyle{ k \geq 1 }[/math] is a fixed integer, and if [math]\displaystyle{ p_1 \geq p_2 \geq \dots \geq p_k }[/math] are the [math]\displaystyle{ k }[/math] largest prime divisors of [math]\displaystyle{ n }[/math] (with [math]\displaystyle{ p_j }[/math] arbitrarily defined if [math]\displaystyle{ n }[/math] has fewer than [math]\displaystyle{ j }[/math] prime factors), then the joint distribution of [math]\displaystyle{ (\log p_1/\log n, \log p_2/\log n, \dots, \log p_k/\log n) }[/math] converges to the law of the first [math]\displaystyle{ k }[/math] elements of a [math]\displaystyle{ PD(0,1) }[/math]-distributed random sequence as [math]\displaystyle{ N }[/math] goes to infinity.
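The vector appearing in this statement can be computed for a given integer with a short Python sketch. The function names are hypothetical, trial division is used only for simplicity, and two conventions are assumptions made here rather than part of the theorem: prime divisors are taken without multiplicity, and a missing [math]\displaystyle{ p_j }[/math] contributes the value 0 (the statement allows any choice).

```python
import math
import random


def prime_divisors_desc(n):
    """Distinct prime divisors of n >= 2, largest first (trial division)."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d  # strip every power of d
        d += 1
    if n > 1:
        primes.append(n)  # the leftover cofactor is the largest prime divisor
    return primes[::-1]


def log_ratios(n, k):
    """(log p_1 / log n, ..., log p_k / log n), padded with 0.0 when n has
    fewer than k prime divisors (an arbitrary convention chosen here)."""
    ps = prime_divisors_desc(n)[:k]
    ratios = [math.log(p) / math.log(n) for p in ps]
    return ratios + [0.0] * (k - len(ratios))


if __name__ == "__main__":
    random.seed(1)
    N = 10**6
    for _ in range(5):
        n = random.randint(2, N)  # uniform on {2, ..., N}
        print(n, log_ratios(n, 3))
```

Collecting these vectors over many draws of [math]\displaystyle{ n }[/math] gives an empirical counterpart of the joint distribution in the theorem; the convergence concerns [math]\displaystyle{ N \to \infty }[/math], so agreement at any fixed [math]\displaystyle{ N }[/math] is only approximate.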

Random permutations and Ewens's sampling formula

The Poisson-Dirichlet distribution with parameters [math]\displaystyle{ \alpha = 0 }[/math] and [math]\displaystyle{ \theta = 1 }[/math] is also the limiting distribution, for [math]\displaystyle{ N }[/math] going to infinity, of the sequence [math]\displaystyle{ (\ell_1/N, \ell_2/N, \ell_3/N, \dots) }[/math], where [math]\displaystyle{ \ell_j }[/math] is the length of the [math]\displaystyle{ j^{\operatorname{th}} }[/math] largest cycle of a uniformly distributed random permutation of [math]\displaystyle{ N }[/math] elements. If, for [math]\displaystyle{ \theta \gt 0 }[/math], one replaces the uniform distribution by the distribution [math]\displaystyle{ \mathbb{P}_{N, \theta} }[/math] on the symmetric group [math]\displaystyle{ \mathfrak{S}_N }[/math] such that [math]\displaystyle{ \mathbb{P}_{N, \theta} (\sigma) = \frac{\theta^{n(\sigma)}}{\theta (\theta+ 1) \dots (\theta + N-1)} }[/math], where [math]\displaystyle{ n(\sigma) }[/math] is the number of cycles of the permutation [math]\displaystyle{ \sigma }[/math], then the limiting distribution is the Poisson-Dirichlet distribution with parameters [math]\displaystyle{ \alpha = 0 }[/math] and [math]\displaystyle{ \theta }[/math]. The probability distribution [math]\displaystyle{ \mathbb{P}_{N, \theta} }[/math] is called Ewens's distribution,[5] and comes from Ewens's sampling formula, first introduced by Warren Ewens in population genetics to describe the probabilities associated with counts of how many different alleles are observed a given number of times in a sample.
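As a sanity check on the formula for Ewens's distribution, the weights can be summed over all permutations for a small [math]\displaystyle{ N }[/math]. The Python sketch below is illustrative only and uses hypothetical function names; it takes the denominator to be [math]\displaystyle{ \theta(\theta+1)\dots(\theta+N-1) }[/math], with [math]\displaystyle{ N }[/math] the number of permuted elements, which is the normalization under which the weights sum to 1. Permutations are represented as tuples `perm` with `perm[i]` the image of `i`, an encoding chosen here for convenience.

```python
import itertools
import math


def cycle_lengths(perm):
    """Cycle lengths, largest first, of a permutation of 0..N-1."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:  # follow the cycle containing `start`
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


def ewens_probability(perm, theta):
    """theta^(number of cycles) / (theta (theta + 1) ... (theta + N - 1))."""
    n_items = len(perm)
    rising = math.prod(theta + i for i in range(n_items))
    return theta ** len(cycle_lengths(perm)) / rising


def total_mass(n_items, theta):
    """Sum of the Ewens weights over all N! permutations (small N only)."""
    return sum(
        ewens_probability(p, theta)
        for p in itertools.permutations(range(n_items))
    )


if __name__ == "__main__":
    print(cycle_lengths((1, 0, 2, 4, 5, 3)))
    print(total_mass(5, 2.5))  # should be 1 up to rounding error
```

For [math]\displaystyle{ \theta = 1 }[/math] every permutation receives weight [math]\displaystyle{ 1/N! }[/math], recovering the uniform case; dividing the output of `cycle_lengths` by [math]\displaystyle{ N }[/math] gives the normalized sequence whose limit is described above. The enumeration grows like [math]\displaystyle{ N! }[/math] and is meant only as a check for small [math]\displaystyle{ N }[/math].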

References

  1. Pitman, Jim; Yor, Marc (1997). "The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator". Annals of Probability 25 (2): 855–900. doi:10.1214/aop/1024404422. 
  2. Bourgade, Paul. "Lois de Poisson–Dirichlet". Master's thesis.
  3. Kingman, J. F. C. (1975). "Random discrete distributions". J. Roy. Statist. Soc. Ser. B 37: 1–22. 
  4. Billingsley, P. (1972). "On the distribution of large prime divisors". Periodica Mathematica Hungarica 2: 283–289.
  5. Ewens, Warren (1972). "The sampling theory of selectively neutral alleles". Theoretical Population Biology 3: 87–112.